Enhancing N-Gram-Based Summary Evaluation Using Information Content and a Taxonomy
Abstract
In this paper, we propose a novel information-theoretic metric for automatic summary evaluation when model summaries are available, as in the AESOP task of the Update Summarization track of the Text Analysis Conference (TAC). The metric is based on the concept of information content, operationalized using a taxonomy. We present and discuss the results obtained at TAC 2009.
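To make the notion of information content over a taxonomy concrete, here is a minimal Python sketch. It assumes a Resnik-style definition, IC(c) = -log2 p(c), where p(c) is the relative frequency of a concept together with all of its descendants. The abstract does not state which definition or taxonomy the paper uses, so the toy taxonomy `PARENT`, the function names, and the sample observations are all hypothetical.

```python
import math
from collections import defaultdict

# Hypothetical toy taxonomy: child -> parent (None marks the root).
PARENT = {
    "entity": None,
    "animal": "entity",
    "vehicle": "entity",
    "dog": "animal",
    "cat": "animal",
    "car": "vehicle",
}


def ancestors_and_self(concept):
    """Yield the concept and every ancestor up to the root."""
    while concept is not None:
        yield concept
        concept = PARENT[concept]


def information_content(observed_concepts):
    """Return IC(c) = -log2 p(c) for every concept seen in the observations.

    Assumption: each observation of a concept also counts toward all of
    its ancestors, so general concepts get high probability and low IC.
    """
    counts = defaultdict(int)
    for concept in observed_concepts:
        for node in ancestors_and_self(concept):
            counts[node] += 1
    total = len(observed_concepts)
    return {c: -math.log2(n / total) for c, n in counts.items()}


# Illustrative observations only; not data from the paper.
ic = information_content(["dog", "dog", "cat", "car"])
```

Under this assumption, the root carries no information because every observation falls under it, while rarer and more specific concepts score higher. A metric built on such scores could weight matched units by how informative they are instead of counting every match equally.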
Similar resources
Using Graph Based Mapping of Co-occurring Words and Closeness Centrality Score for Summarization Evaluation
The use of predefined phrase patterns, such as N-grams (N >= 2), longest common subsequences, or predefined linguistic patterns, gives no credit to non-matching or smaller useful patterns and may therefore result in loss of information. Moreover, a 1-gram-based model produces many noisy matches. Additionally, due to the presence of more than one topic with different levels of importan...
A survey on Automatic Text Summarization
Text summarization aims to produce a condensed version of a text while preserving its original ideas. Textual content on the web, in particular, is growing at an exponential rate. Sifting through such a massive amount of data to extract the useful information is a major undertaking and requires an automatic mechanism to aid with the extant repository of informa...
Automatic Summarization from Multiple Documents
This work reports on research conducted in the domain of multi-document summarization using background knowledge. The research focuses on summary evaluation and on the implementation of a set of generic-use tools for NLP tasks, especially automatic summarization. Within this work we formalize the n-gram graph representation and its use in NLP tasks. We present the use of n-gram graphs for t...
Content Evaluation of Iranian EFL Textbook Vision 1 Based on Bloom’s Revised Taxonomy of Cognitive Domain
Textbooks are a common feature of classrooms and an important means of contributing to curricula. Their content is therefore essential to adequate curriculum planning. Textbook analysis is a means by which different features of textbooks can be examined and their effectiveness validated. This study set out to evaluate the content of...
A Comparison of Word- and Term-based Methods for Automatic Web Site Summarization
Automatic Web site summarization is an effective means of making the content of a Web site easily accessible to Web users. We demonstrate that a content-based approach to summarization, based on keyword and key sentence extraction from narrative text, can generate summaries that are as informative as human-authored summaries. This work is directed towards summary generation base...